A Contig-Based Strategy for the Genome-Wide Discovery of MicroRNAs without Complete Genome Resources
Abstract
MicroRNAs (miRNAs) are important regulators of many cellular processes and exist in a wide range of eukaryotes. High-throughput sequencing is a mainstream method of miRNA identification, making it possible to obtain the complete small RNA profile of an organism. Currently, most approaches to miRNA identification rely on a reference genome for the prediction of hairpin structures. However, many species of economic and phylogenetic importance are non-model organisms without complete genome sequences, and this limits miRNA discovery. Here, to overcome this limitation, we have developed a contig-based miRNA identification strategy. We applied this method to a triploid species of edible banana (GCTCV-119, Musa spp. AAA group) and identified 180 pre-miRNAs and 314 mature miRNAs, three times more than were predicted by methods based on the available datasets (represented by EST+GSS). Based on the recently published miRNA data set of Musa acuminata, the recall rate and precision of our strategy are estimated to be 70.6% and 92.2%, respectively, significantly better than those of the EST+GSS-based strategy (10.2% and 50.0%, respectively). Our novel, efficient and cost-effective strategy facilitates the study of the functional and evolutionary role of miRNAs, as well as miRNA-based molecular breeding, in non-model species of economic or evolutionary interest.
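The recall rate and precision quoted above compare a set of predicted miRNAs against a reference set. The abstract does not say how the authors matched predictions to the reference, so the following is only a minimal sketch that assumes exact matching on identifiers. The function name `precision_recall` and the toy identifiers are hypothetical and are not data from the study.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted vs. reference miRNA sets.

    Assumption: a prediction counts as correct only if its identifier
    appears verbatim in the reference set.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of predictions that are in the reference set.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of the reference set that was recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy identifiers, for illustration only.
predicted = {"miR156", "miR159", "miR166", "candidate-1"}
reference = {"miR156", "miR159", "miR166", "miR172", "miR396"}

p, r = precision_recall(predicted, reference)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case three of four predictions are in the reference set and three of five reference entries are recovered, which illustrates how a strategy can have high precision while still missing part of the reference set.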
Related articles
I-40: Male Genome Programming, Infertility and Cancer
Background: During male germ cell differentiation, genome-wide reorganizations and highly specific programming of the male genome occur. These changes not only include the large-scale meiotic shuffling of genes, taking place in spermatocytes, but also a complete “re-packaging” of the male genome in post-meiotic cells, leading to a highly compacted nucleo-protamine structure in the mature sperm...
Papaya Dieback in Malaysia: A Step Towards A New Insight of Disease Resistance
A recently published article describing the draft genome of Erwinia mallotivora BT-Mardi (1), the causal pathogen of papaya dieback infection in Peninsular Malaysia, has significant potential to overcome and reduce the effect of the disease on this vulnerable crop (2). The authors found that the draft genome sequence is approximately 4824 kbp and the G+C content of the genome was 52-54%, which is very similar to t...
Generation of Physical Map Contig-Specific Sequences Useful for Whole Genome Sequence Scaffolding
With the rapid advances in next-generation sequencing technologies, more and more species are being added to the list of organisms whose whole genomes have been sequenced. However, the assembled draft genomes of many organisms consist of numerous small contigs, due to the short length of the reads generated by next-generation sequencing platforms. In order to improve the assembly and bring the genome contigs tog...
A Simple Genome Walking Strategy to Isolate Unknown Genomic Regions Using Long Primer and RAPD Primer
Background: Genome walking is a DNA-cloning methodology that is used to isolate unknown genomic regions adjacent to known sequences. However, the existing genome-walking methods have their own limitations. Objectives: Our aim was to provide a simple and efficient genome-walking technology. Material and Methods: In this paper, we dev...
Genome-wide computational prediction of miRNAs in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) revealed target genes involved in pulmonary vasculature and antiviral innate immunity
The current outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in China has threatened humankind worldwide. Coronaviruses contain the largest RNA genomes among all known RNA viruses; therefore, the disease etiology can be understood by analyzing the genome sequence of SARS-CoV-2. In this study, we used an ab initio based computational tool, VMir, to scan the complete geno...